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We consider the problem of detecting a burst signal of unknown shape in the data from gravitational wave 
interferometric detectors. We introduce a statistic which generalizes the excess power statistic proposed first by 
Flanagan and Hughes, and then extended by Anderson et al. also to a multiple detector case. The statistic that 
we propose is shown to be optimal for arbitrary noise spectral characteristic, under the two hypotheses that the 
noise is Gaussian, albeit colored, and that the prior for the signal is uniform. 

The statistic derivation is based on the assumption that a signal affects only affects samples in the data 
stream, but that no other information is a priori available, and that the value of the signal at each sample can 
be arbitrary. This is the main difference from previous works, where different assumptions were made, like a 
signal distribution uniform with respect to the metric induced by the (inverse) noise correlation matrix. The 
two choices are equivalent if the noise is white , and in that limit the two statistics do indeed coincide. In the 
general case, we believe that the statistic we propose may be more appropriate, because it does not reflect the 
characteristics of the noise affecting the detector on the supposed distribution of the gravitational wave signal. 

Moreover we show that the proposed statistic can be easily implemented in its exact form, combining standard 
time-series analysis tools which can be efficiently implemented, and the resulting computational cost is still 
compatible with an on-line analysis of interferometric data. 

We generalize this version of an excess power statistic to the multiple detector case, considering first the noise 
uncorrelated among the different instruments, and then including the effect of correlated noise: we show that 
this can be done either perturbatively, or in exact form. 

We give full details about the implementation of the algorithm, both for the single and the multiple detector 
case, and we discuss exact and approximate forms; the choice among them depends on the specific characteris¬ 
tics of the noise and on the assumed length of the burst event. 

As a example, we show what would be the sensitivity of the network of interferometers to a 6-function burst. 

PACS numbers: 04.80.Nn, 05.45.Tp, 07.05.Kf 


I. INTRODUCTION AND SUMMARY 


Several large scale interferometric detectors 10111 are 
currently under commissioning and are expected to start data 
acquisition and reach their design sensitivity in a few years. 
Some of the candidate sources, like the coalescing binaries in 
their inspiral phase, can be modeled with reasonable accuracy 
and the gravitational waveforms can be predicted, thus allow¬ 
ing a matched-filter detection strategy; see |Fj| and references 
therein for a review. On the other hand, as argued in |6j] it 
is conceivable that the uncertainty on the waveform will re¬ 
main high for sources like the Type II supernova explosions 
or the merger phase in the coalescence of black holes or neu¬ 
tron stars: in this context, the issue of detecting events poorly 
modeled or not modeled at all remains crucial. 

The problem has already been faced from different point 
of views: some authors ||| [7| j§|] aim at devising several sim¬ 
ple and computational inexpensive algorithms, to be run in 
parallel after having been tested and optimized against model 
waveforms |:9j|. Others start from general hypotheses on the 
distribution of the signals and the noise and derive statistics 
optimal under those assumptions 10000 - 

We consider of particular interest strategies, like the excess 
power statistic proposed by Flanagan and Hughes [pi]], or the 
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norm filter studied by Arnaud et al. [ 10|, which try to make 
minimal assumptions on the nature of the signal, like time 
duration and bandwidth only: in particular the excess power 
statistic has been recently analyzed by Anderson et al. 
and extended to the “blind” search of burst events from a net¬ 
work of interferometers. However, as pointed out in [jl3[ note 
8], the authors have actually made the assumption that the sig¬ 
nal distribution is flat with respect to the inner product defined 
by the inverse of the noise correlation matrix. They recognize 
that this is an approximation, and correctly claim that it is le¬ 
gitimate when the noise spectrum does not vary rapidly in the 
band of interest: however we will argue in Section IIA that 
this may be not true for real detector noise, and we will show 
that in view of the current models for burst signals from the 
core collapse of supernovae § 0] the correlation length of 
the detector noise cannot be assumed short with respect to the 
event duration, even assuming the design noise. 

The choice of the signal distribution in | [i~3| | has the ad¬ 
vantage of making easy to incorporate a priori informations, 
when available, about the absolute scale of the expected sig¬ 
nals: we shall see that this is in general more complicated 
with our statistic, if detection thresholds were to be set using 
a Bayesian criterion. 

The analysis method that we propose consists essentially of 
two steps: 


A filter the input data with a matched filter for 8 functions. 

B Compute a statistic similar to the energy of the data within 
each time window we are willing to test for the pres- 
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ence of a burst, using a particular scalar product which 
can be conveniently computed either using the discrete 
Karhunen Loeve transform (DKLT), or (approximately) 
using a discrete Fourier transform (DFT). 


The algorithm can be easily generalized to the multiple de¬ 
tector case, resulting in a optimal statistic which depends on 
the direction in the sky the signal is supposed to come from. 
It also turns out that possible correlations among the detector 
noises can be taken into account in a natural way by modifying 
the 8-filtering step, and that these modifications are a natural 
consequence of the likelihood maximization procedure. 

The paper is organized in two main sections: in Section II 
we derive the statistic for the case of a single detector: then in 
Section 1 we extend it to the multiple detector case. 

In greater detail, the plan of the work is as follows: in Sec¬ 
tion IIA we motivate our study, introducing our hypotheses 
on the signal and the noise and discussing them in view of 
the current models for supernovae signals. In Section IIB we 


briefly recall the Bayesian fram ewor k we follow in deriving 
the optimal statistic. In Section 1IC we consider the simple 
but illuminating case in which the burst event is a 8 function, 
that is with a duration affecting only one data sample: we de¬ 
duce a detection strategy that corresponds to one of the data 
analysis met hods used in resonant bar experiments [[l5|] . In 
Section 1 1 D| we derive the exact expression for the likelihood 
ratio, for a burst affecting N samples in the data stream, and 
we show that the calculational ste ps a re essentially the two 
A,B mentioned above. In Section IIE we make a digression 
on the Karhunen-Loeve transform, a tool well known in statis¬ 
tics [[T^, [p7]| and already applied in the analysis of data from 
bar detectors (Jl8|] , and prop osed for the study of narrow reso¬ 
nances in Section [ll l] we apply the DKLT expansion to 
our problem and we show that it is a convenient way to imple¬ 
ment step B. When the supposed burst length Ni\ is large, an 
approxim ate for mula using the DFT can be used, as discussed 
in Section IIF 2. 

The very meaning of the word “optimal” u sed f or defining 
the proposed statistic is discussed in Section IIG , where we 
clarify the limits of the method, in particular with respect to 
the possible inclusion of prior information on the strength of 
the signal. 

We elaborate in Section IIH on the details of the detection 
algorithm, and we show that step A is easily implemented 
using standard tools of time series analysis, including the 
whitening transformation, although the required operation is 
different from a whitening. 

The distribution of our statistic in absence and presence of 
a signal is considered in Section 111: we show that it corre¬ 
sponds exactly to a X 2 variable, respectively central or not 
central. 

The multiple detector case is treated first in the approxima¬ 
tion of uncorre lated Gaussian noise across the detector net¬ 
work in Section III B, where we show that the techniques used 
in the single detector case can be easily extended, in a way not 
much different from what has been done in [|i~3|l, but including 
the above mentioned step A implementing the filtering for 8 
functions. We show in Section III B 1 that the step B of the 


resulting algorithm can be written in exact form using a vec¬ 
tor DKLT, while a simpler form, similar to the one derived in 
a is valid in the long burst limit, as discussed in Section 


III B 2. 


The case in which the noise of different detector s disp lays 
some degree of correlation is considered in Section III C : we 
start from the same hypotheses as Finn [|2^| and we show how 
the effect of the cross-detector terms in the noise correlation 
matrix can be taken into account in our algorithm, either per- 
turbatively, if the cross terms are small, or exactly if they are 
not: not surprisingly, the needed modifications turn out to af¬ 
fect only the 8-filtering step, and in a simple way. 

We finally give an example of application of the algorithm 
to the detection of bursts of unit duration, for a network of 
detectors comprising either the three LIGOs |[l|] or including 
also GEO600 [g], TAMA [g] and Virgo [g]. We compute the 
resulting SNR, which is a function of the direction in the sky; 
this allows us to pictorially show to which extent the network 
analysis strategy is advantageous, at least in the ideal situation 
in which the detector noise is Gaussian. 

Throughout all the paper we adopt a discrete-time, discrete- 
frequency notation: the conventions followed for the Dis¬ 
crete Fourier Transform require some care and are detailed 
in App. [a], while the characteristics of the detectors in the net¬ 
work are detailed in App. |b| 


II. SINGLE DETECTOR ANALYSIS 
A. Noise and signal statistics 

We will keep consistently a discrete time, discrete fre¬ 
quency notation, assuming a sampling rate f s and a finite ob¬ 
servation time T = N/f s . We assume that the detector noise 
has zero mean and is Gaussian, albeit colored, characterized 
by a correlation matrix 

( R «) [i,j] =E[niiij\ (1) 

where n, = n[i\ = n (i/fs), and i £ [1,IV]. We further assume 
that the noise is stationary, hence R is a symmetric Toeplitz 
matrix & whose entries depend only on the difference of 
indices. In terms of this correlation matrix, the probability of 
observing a certain set of noise data n (of total length N ) is 
given by the joint distribution 


P(n) = 


\J (27t) A 'det/? 


exp 




( 2 ) 


if a certain signal s is also present, the conditioned probability 
of observing a set of data x is 


P(x| s ) = 1 p[-;( x ~ s ) R„ *■(*-■)]. 

^/{2n) N detR 


(3) 


We should stress that formulas in Eqs. (g|g make use of the 
information available in the finite data sequence: we observe 
only N data and with the expression of F(n) and F(x|s) we 
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cannot take into account the effect of the past data points, 
which fall outside our observation window [j45|]. We shall see 
later that when considering a shorter analysis window (of size 
N\ i) contained in a longer data train we should actually ex¬ 
ploit also the information contained outside the Ni\ window. 
This information is of little relevance if the analysis window is 
much longer than the largest correlation times in the noise: but 
this is not generally the case when considering burst events. 

What matters in deciding what is the relevance of this 
boundary effect is the noise spectrum: one knows indeed (see 
Section |a|) that 


R 


a — b 

— 


s ( r . _I U = 7 vE 


2 N—2 gi2Tik(a—b) 


Ub f s N £1 S„[k] 


(4) 


and considering a noise model as given in Eq. ( jB 16 ). which 
summarizes the best sensitivity reachable in the first genera¬ 
tion interferometers, one deduces that 


(2) instances of these statistics, derived from these data, will 
exhibit a correlation in time which will have to be kept into 
account when computing false alarm and false dismissal prob¬ 
abilities [[m]]. In the present paper we will concentrate on the 
first issue. 

One general way, other than subtracting the narrow spec¬ 
tral components, to attack this problem would be to assume 
that data have been pre-whitened [||, [7} :8, [u]|, for instance 
usinga time-domain filter estimated from the data themselves 
| [2^ , p6| ] : however this strategy requires to take into account 
the effect of the whitening filter throughout the whole detec¬ 
tion chain, in particular the alteration of the signal waveform 
and consequently of the signal distribution. In other words, 
when integrating over the space of possible signals, we are not 
allowed to ignore that the measure is changed by the transfor¬ 
mation. 

To render our statements more precise, we need to discuss 
in detail the detection framework adopted. 


P _1 (t) ~A 0 e‘ l ' l/To cos27t/ 0 T + ... (5) 


B. Detection framework 


neglecting faster decaying terms. The decay time(s) To,i,... 
characterize how much the matrix R 1 (and therefore R itself) 
differs from a diagonal matrix. In Section B 2 and in Table [TTT] 
we show that for the current models of the baseline interfer¬ 
ometer noise in the first generation detectors the values of To 
range from 1.4 ms in TAMA to 6ms in Virgo (corresponding 
to 0(100) samples). 

These time scales should be compared with the expected 
duration of the bursts: for instance Zwerger and Muller [^| 
Figs. 5,6] have shown several examples of gravitational wave¬ 
forms emitted in axisymmetric core collapse events, display¬ 
ing large variations on scales of a few ms, and narrow large 
amplitude peaks even shorter than 1 ms. The same features 
are found in more recent simulations, which include relativis¬ 
tic effects, by Dimmelmeier et al. [0 Fig. 2 (Model A)]: we 
conclude that it is generally not justified from the physical 
point of view to surmise that the correlation decay times of 
the noise are short compared to the burst duration. 

Moreover, the values for the decay times quoted in Table @ 
refer to a ideal detector noise, free of narrow resonances: real 
detectors may well exhibit richer spectral features [^]. For 
instance a thermal resonance with proper frequency /o and 
quality factor Q would contribute to the noise correlation a 
term with a characteristic decay time Id = with violin 


modes easily having Q > 10 5 , and frequencies /o = 0(1000) 
Hz, the decay time can easily reach tens of seconds. Although 
it is foreseen to subtract the effect of these resonances from 
the data, using for instance Kalman filters [^3]] it is fair to 
say that any residual effect, due for instance to a imperfect 
cancellation of a very high Q resonance, will contribute to 
increase the noise correlation length above the values deduced 
from the baseline noise. 

There are at least two important consequences of the pres¬ 
ence of a non-zero correlation length: (1) statistics built using 
maximum likelihood criteria must be modified to take into ac¬ 
count noise outside the window affected by the burst event; 


We suppose that, when present, the burst affects only a in¬ 
terval Iji = Nu/f s starting at absolute time fburst- that is a num¬ 
ber An of samples in the data stream. We are unable to pre¬ 
scribe any prior for the signal amplitudes, so we treat it, at 
any given instant, as nuisance parameters, to be integrated out. 
Anderson et al. | [TT| |l3| ] have made similar assumptions, but 
as we have already anticipated with a different hypothesis on 
the measure for this integration. 

Given a data vector x, the a posteriori probability of having 
observed those values can be written as [El] 


P(x)=P(x|l)P(l)+P(x|0)P(0) (6) 

where as usual Pi 1 ),P(0) are the a priori probabilities for 
a signal of unspecified form being present in the data, and 
P(x|l) is the probability of observing the dataset x given that 
some signal s is present, while P(x|0) is the probability of ob¬ 
serving the same dataset x in absence of signals: it was defined 
in Eq. (^j). In turn we have that 

P(x|l) = J P(x|s)P(s) t/s (7) 

where P(s) is the a-priori probability of having a signal s 
present, while P(x|s) has already been defined in Eq. (|3|). We 
may or may not have a good guess for the distribution P(s): 
in this paper we assume to have no a priori information. 

Given our complete ignorance of P(1),P(0), we resort to 
defining the (integrated) likelihood ratio that any signal is 
present as 

A(x) = = / e-j sR » ,s+sR » lx P(s)Js; (8) 

1 | U J J 

in terms of A and using the Bayes rule P(l|x)P(x) = 
P(x|l)P(l) we can write 


P(1|X) A(x) +P(0)/P(1) ’ 


(9) 
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for the probability of having observed a signal, conditioned 
by the particular instance of data x that we have received. Al¬ 
though we have no idea of the priors P(0),P(1), this proba¬ 
bility is a monotonic function of the likelihood A(x), which 
is therefore the quantity to be estimated, in dependence on the 
assumptions (or lack of assumptions) on P{ s): to implement 
our complete ignorance on the waveform we will assume that 
P (s) is flat in the space K' v of possible signals of length N\\. 

For the sake of comparison, Anderson et al. @ Eq. 3.1, 
and the following discussion and note] have assumed that sig¬ 
nals s are a priori distributed in a uniform way with respect 
to the metric induced by the scalar product (x,y) = x • R~ 1 • y: 
this is the essential reason why our results are different from 
theirs. 


C. Burst of unit duration (8 function) 


[(w;,-z)j 


which can be regarded as a energy, but takes into account the 
altered signal shape (a 8 function in this case) under the appli¬ 
cation of the transformation W„. 

In passim it is worth noticing that we wrote the distribu¬ 
tion d{)(L) assuming implicitly to compute the moments of 
our statistic over a ensemble of noise realizations; in practice 
however one takes samples from a single time-series, at dif¬ 
ferent locations in the data stream. This means that successive 
instances of the statistic L, for different indices a, will belong 
to a joint distribution, which does not factor in a product of 
terms like the yj in Eq. [Tt]: for instance we can consider the 
cumulant 


Let us first consider a extreme example, namely a burst 
affecting only one of our data samples: its amplitude is un¬ 
known, and we want to test the hypothesis that the event oc¬ 
curs at the sample arbitrarily labeled a. Then the integrated 
likelihood ratio in Eq. (j8|) is 


f\(xj — j (> ' AiV*+V ; 1 -xj„ 


= exp 


1 [( R ,7 


2 (R7 1 ),, 


( 10 ) 


where no summation over a is understood. The interesting 
statistic for a certain possible arrival index a is 


E [L a ( x)L b (x)] 

E [L a (x)]E [L h (x)] 

which exposes that the statistic L a as a function of the sup¬ 
posed burst event location a has a non-negligible cross cor¬ 
relation in colored noise. This should be always kept in 
mind when post-processing the results of the analysis, for in¬ 
stance when deriving limits on false alarm or false dismissal 
rates p4||: we will not elaborate further on this. 

We now proceed to generalize the statistic L to a burst of 
arbitrary length. 


ter 1 ) 


ab 


te 1 ), 


(15) 


L a (x) = 21nA(x) 


fj_ 

N 


V JlTiak/N 
Lke 




(ID 


and is readily identified as the Wiener filter for a 8 
function |[i~5||. In absence of signals its distribution is 


do(L) 


V2nL 


e 2 




( 12 ) 


as expected for the square of a Gaussian variable; note that the 
statistic L is not equivalent to taking the energy of the signal. 

Nor is it equivalent to first whitening the data and then tak¬ 
ing the energy: indeed whitening would mean, in matrix nota¬ 
tion, to perform a lower-upper (LU) factorization of the matrix 
R 1 in the form||46| 

R;' = W'„ ■ W„; (13) 


where W„ is a lower triangular matrix which defines a causal 
transformation, dependent on the noise statistics: the matrix 
W„ defines a whitening transformation, namely the random 
vector z = W„ ■ x is distributed as white gaussian noise. The 
statistic L a (x) can also be written in terms of the whitened 
data z 


D. General burst 


It is useful to introduce a vector space notation: let us call 
the vector space of all possible data vectors, having length 
N; we are willing to test for the hypothesis of presence of a 
burst signal in a certain subspace ■F||, defined by taking /V -C 
N consecutive samples, from a certain starting position (say, l ) 
in the original vector. The orthogonal subspace, of dimension 
N — ~ N, will be called 'Ey. 

We have again the likelihood ratio 

AT|| 

A(x)= .‘-JO/JJdsi (16) 

i 

where the indices i,j run only over elements of however 
the noise correlation matrix R„ is defined for an arbitrary in¬ 
dex difference. Let us introduce the matrix 

( R 7 1 ) I! = ( R ,7 1 ) [l:l+N burst , l:l+N bursl \ (17) 

obtained restricting the indices of the R„ 1 matrix on the [1,1 + 
Nburst] interval: it acts only on the +« subspace. 














5 


Performing the Gaussian integrals over amplitudes .v, of the signal, which we treat as nuisance parameters, we obtain 

-l 


A (x) exp 


(R,; 1 -*)n• {( R n l ) { y • (R,; 1 -*)n =ex P l - Xa (R- l ) ai (((r»O,,)' 1 ) (r,7 Vp 


(18) 


where indices a, (3 run in 'Pm + and indices i. j run in ‘P ; the overall normalization does not depend on x and can be ignored. 

I- 


We are forced to this somewhat cumbersome notation because 
the operations of projecting over the “burst” subspace 'Pm and 
of inverting a matrix do not commute. They do only when 
N = A||: in that case one would have 

A(x)oc e i xR M lx = e 2( w »x) 2 (19) 

where we used Eq. (]5 a|), and W„ • x is distributed as white 
gaussian noise. This means that in the case N = A|| the 
likelihood would be just a monotonic function of the energy 
(W„ • x ) 2 of the whitened sample. Note however that this ex¬ 
ample is very different from the previous one: there we had a 
unit burst in a long data train, here we would have a burst as 
long as the data train: both are extreme cases. 

We will from now on instead take as optimal statistic for a 
unknown burst, with flat prior for the sample amplitudes, the 
expression 

L(x)= £ (R,7 I x) i (((R ) ; 1 ) || )' 1 ) (R(20) 

ije i'll x ' ij 

The reader might wonder if this is of any practical use. In par¬ 
ticular, the estimation of the matrix ^ (R, 7 1 ) || ^) l°°ks awk¬ 
ward: we should estimate R„, then compute R (J 1 , then take 
a minor Am x A|| along the matrix diagonal and finally invert 
it@. 

Let us however notice the trivial identity 


(R,7 1 n).(R,7 1 n) 


E(R7 1 ) ! a( R 7 1 );p £ M 

ocp 

(R 7 % ( 21 ) 


in other words, considering the time series R„ 1 • n, which is 
R 7 1 -x in absence of signal, and restricting it to the 'Pm sub¬ 
space, we obtain an autocorrelation matrix that is just R 7 . 
Barring boundary effects on the matrix R„, also the process 
R, 7 1 ■ n is shift invariant, and R 7 1 is a Toeplitz matrix when 
considering diagonal minors sufficiently far from the borders. 

We are therefore able to easily compute R 7 1 just from the 
analysis of the series R 7 1 n: the last step, in order to write 
down the statistic for a burst in a certain subspace 'Pm, is to 
restrict the matrix to that interval and invert it: a efficient tool 
to accomplish this task is the Karhunen-Loeve expansion |[P7||, 
which we find useful to recall briefly in the next section. 


E. Karhunen-Loeve expansion 


In Section IIC we have exploited the LU factorization of a 
correlation matrix R. It is also well known [|T^, [ 77 ], [2TJ that. 


being R real symmetric and positive definite, an expansion in 
terms of its eigenvalues and eigenvectors exists, namely 

K 

fiap = £ <Wa¥p (22) 

k= 1 

where K is the dimension of the matrix and 


fiapVp = <Wp (23) 


with eigenvalues G& > 0. The \ \\l k . k G [1, A]} 
are chosen orthonormal 

eigen-vectors 

E¥aVa = 5 H ; 

a 

(24) 

and define a basis in the space M K : any data vector x can be 
written as 

K -1 

x= £ qV, where q = x ■ l|/ ; 

k= 0 

(25) 

this decomposition is called discrete Karhunen-Loeve trans¬ 
form (DKLT). As with the Fourier transform, the Parseval’s 
theorem holds 

II 

(26) 

and it is immediate to show that 


E[ckci] = a k hi 

K K 

E [x • x] = £ Raa = £ Ok ■ 
a=l k= 1 

(27a) 

(27b) 


The similarity with the Fourier transform goes further: it can 
be shown (see for instance | [~i~7| . sec. 4.7.2]) that in the limit 
of large K the basis elements converge to sines and cosines, 
and the eigenvalues converge to the corresponding bins of the 
spectral density. 

However, for finite K the DKLT is a better representation 
for the noise because it takes into account the finite-size ef¬ 
fects |[77ij|: recall that we are interested in A|| not necessarily 
large. In particular, the coefficients c* are uncorrelated ran¬ 
dom variables, thus making the statistical analysis easier. 

We are naturally led to apply the DKLT to the problem at 
hand. 
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F. Exact expression for the burst statistic 


In Section IIP we have shown that the exact statistic, de¬ 
fined in Eq. (j2(jj), can be expressed a 

-l 


L = y\\ 


(R.v)i 


•yii 


(28) 


where yp = (R„ 1 - x),,. 

The vector y itself can be easily computed: recall that we 
have assumed N large, hence we can write 


K: <*» 


where S n is the one-sided spectrum corresponding to the cor¬ 
relation matrix R„. 

Thanks to the KL transform we are now also able to write 
down explicitly the inverse of the correlation matrix R v of the 
y time series, keeping into account the restriction to the V\\ 
space, in terms of a appropriate DKL basis \|/n , k £ [l, N« ], as 



hence we finally have 


*11 i 

l= y — 

tx a* 



(30) 


(31) 


Note that the matrix R v , as we have seen in Section [ID, is 
approximately a Toeplitz matrix: this means that (RA, does 


not depend on the segment, in the data train, chosen to test for 
the presence of bursts [p8|[. 

The reader might wonder why the DKLT is at all neces¬ 
sary: our statistic in Eq. (^8j) could be computed just inverting 
the matrix (RA,, and then applying it to each successive data 
chunk; what is the advantage of the expression in Eq. (|3~i|)? 
The answer is that the computational cost is the same, but 
the DKLT decomposition gives us more flexibility. We are 
not forced to sum over all the elements: we can decide for 
instance that some of the basis elements correspond to large 
noise components, and can be left out without significantly af¬ 
fecting the detector performance. The fact that the coefficients 
\|ffj ■ y||, for different values of k, are by definition statistically 


uncorrelated renders this procedure sound and does not com¬ 
plicate the statistical analysis . 

We will e laborate more on the practical implementation in 
Section IIH : we now turn to consider two special cases which 
help building a better understanding. 


1. A special case: N = 1V|| 

This is unrealistic: in this case there is no orthogonal space, 
(R- 1 ),, = R.T 1 and the likelihood ratio becomes 


A(x) = exp 


1 


:X R„ 


(32) 


if |<)> fc , k = 1, IV j is the appropriate DKL basis for the noise 
correlation matrix R„, that is 

N 

R„ = !>*<!> W; (33) 

k= 1 

one would therefore use as a statistic for burst searches 

'•=e^(H 2 - (34) 

This expression corresponds closely to the excess energy 
statistic defined in[[T3j Eq. (1.4)], with two differences: (1) 
by using the DKL expansion it takes into account the finite 
size of the sample; (2) it is shown to be appropriate, given 
our assumptions on the signal distribution, only in this very 
special case when no information is available before and after 
the data segment we are searching for a burst. The correspon¬ 
dence makes more evident why our result differs from theirs: 
having they chosen a uniform prior with the noise metric, they 
have effectively decoupled the Lx, L|| subspaces, which is 
equivalent in our context to neglecting the presence of corre¬ 
lated noise. 


2. Approximate expression: 7V|| large 


Let us go back to the case in which we search for a burst of 
length N - in a much longer data train of length N: as we said, 
sufficiently long to resolve the narrow spectral features which 
give rise to long correlation times. The high sampling rate 
(9(20kHz) needed to exploit the wide spectral range available 
in interferometer data may lead to consider N\\ of the order of 
several hundred samples, even for signals of a few tens of ms. 

If this results in a excessive computational cost in the ap¬ 
plication of the KL transform, we can exploit its convergence 
to the Fourier Transform, in the limit of N large: namely, we 
can write down approximately 



N \r 2 ? 

r 1 Y —- 

Js h s#] 


Wfc® wf 


(35) 


where w/. are the Fourier basis vectors (see Section [a]) in L|, 
approximating the DKL transform for (R v )|| with eigenvalues 

h=^y[k]f s . 

It follows the expression for the approximate statistic 


i(x) “ f7 ‘ I, ^J |w ‘ y " 12 

f "V 2 - 1 1 

= £ I AAiiWI 2 


-1 Sy[k] 


(36) 


L(x) is the sum of the squares of the Fourier coefficients 
yj| [ k} = /“* • yjj of the time series y|| = (R” 1 -x^, 

weighted with the corresponding spectral noise density. This 
expression is similar to the excess energy statistic defined 
in the difference, apart the A' large approximation, 

is that we found it necessary to start from data filtered for the 
occurrence of 5. 
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G. In which sense the statistic is “optimal” 

Before moving to the practical implementation of the al¬ 
gorithm, it is very important to fully understand the conse¬ 
quences of the assumption we have made about the a priori 
distribution of the signals: to this end, let us have a second 
look at the likelyhood ratio 

A(x) = J e~? sR " ls+sR " l - x P{s)ds. (37) 

Our choice has been that P(s) = 1, which is a way to avoid 
introducing any scale in the problem, which might bias the 
analysis. There is however a drawback: larger values of s ■ s 
are favored, in fact 


where the matrix L should be determined maximizing the 
probability of detection while keeping the false alarm rate 
fixed. We were however unable to prove that this is actually 
the case, at least under certain restrictions on the form of P (s). 

Another implication of this discussion is that with our 
statistic it is difficult to set Bayesian thresholds, as derived by 
choosing a particular form for p (p), and making assumptions 
on its parameters. This is however a somewhat less crucial 
issue because thresholds may be set following a frequentist 
approach, that is by limiting the false alarm rate, as shall be 
discussed in Section III. 


H. Description of the algorithm 


ds = p N \r l dpda N]l (s) (38) 

where p = y/s ■ s and c/fly is the solid angle element in /V| 
dimensions. If we have (say) a priori information only on the 
energy of the signal, or equivalently on the distribution p (p), 
then we would like to follow the same approach as in [ |l3[ Sec. 
Ill] and write 


A ( x ) = Jp{ P)A(x|p)dp (39) 

where 

A(x|p) = J8 (p-\/s _ s)e 3S ( r " )|| s+S5 Vs (40) 

and y = R 7 1 x belongs to the parallel space 'P . Changing 
basis with a DKL transform s —> £, we obtain 

A(x|p) = 5 (p — ^/qPq)e~^ k ° k< i + ^ c dc, 

r 2 

= / e - ! rZk<sl? k +Pl.kC* c kdCl Nll (q) ( 41 ) 

where q = \j/ /; ■ yy and G/ ; are the eigenvalues of (R ; y 1 ) p. If we 
had 0 / ; independent on k, in other words if all the directions in 
the ^ space were equivalent, we could as in [|l3j Sec. Ill] com¬ 
pute the integral in closed form, by aligning one of the axes 
with the direction of the vector c. This was possible to Ander¬ 
son et al. because they had chosen a signal prior function of 
s ' (Rh)h " 1 -s. 

In the case considered by us the expression in Eq.(fflj) is 
not in general a function solely of the statistic L = L/ ' c l 
and of p, because the noise introduces preferential directions 
in the space of possible signals. 

This discussion shows that the statistic L we have pro¬ 
posed is strictly speaking optimal only for a signal prior P (s) 
constant[|f9|| : we can make the ansatz that for a more general 
prior, depending on a scalar function of s, the optimal statistic 
L might still be of the form 


L = 


Y* Lki 


CkCl 

s/OkOj 


The expressions in Eqs. ( [20| . [iT 36) define the algorithm 
for estimating the (log)-likelihood, at different approximation 
levels: we find useful to detail the procedure. 

We assume that the starting point is a continuous stream of 
data, at sampling frequency f s , whose Gaussian noise compo¬ 
nent is assumed to be stationary. The purpose of the algorithm 
is to search the stream for bursts of length N' and unknown 
shape; the choice of N\\ is arbitrary if there is no physical hint. 

The steps of the algorithm are the following: 


1 . partition the data stream into data vectors x of length 
N 'Jt> N ; each vector should overlap the following by 
a certain amount 2 M which needs tuning (it is related 
to boundary effects on the correlation matrix, which de¬ 
pend on the specific noise considered). The length N 
must be sufficient to resolve the narrow spectral features 
in the data, or in other words to ensure that the correla¬ 
tion function R„ (/ /f s ) = E [x[i] x[i + /]] is sampled over 
a sufficiently long interval, allowing it to decrease to 
zero with sufficient accuracy. 

2. Estimate the correlation matrix, or equivalently the 
sample spectrum S n [k] , over the x segments: that is, 
with frequency resolution df = jj. For instance us¬ 
ing a Welch’s overlap and save procedure to combine 
estimates from different vectors of length N, or av¬ 
eraging estimates obtained with a multi-taper spectral 
estimator|[28||. The cost of this step, per vector, is 
0(NlnN). 

3. For each data vector x, estimate a new vector y = Ry 1 • 
x; this is equivalent, as we have seen in Section |ll Cj , to 
filter for 8 functions. It can be done at least in two ways, 
not necessarily equivalent from the numerical point of 
view: 


(a) in the frequency domain, Fourier transforming the 
data x —> x and defining 

(43) 


(42) 


the computational cost is again 0(NlnN). 
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(b) In the time domain, first performing a spectral fac¬ 
torization R~ 1 = W* • W„ which defines two ma¬ 
trix operators W„ and Wf,, the first causal and 
the other anti-causal. Both can be implemented 
as digital filters: the anti-causal one after revers¬ 
ing the input data. Several methods exist to per¬ 
form such a factorization and to apply the result¬ 
ing filters to the data [[T(| [l7| ]: see [25[ [2f]| for an 
application to the analysis of data from the Cal¬ 
tech 40m detector |^2|]. If the noise is stationary 
across several data vectors, the computational cost 
of the spectral factorization itself becomes negli¬ 
gible: otherwise, it can be shown [[1/7]] that the cost 
of estimating filters with P coefficients, using for 
instance the Levinson-Durbin recursion, grows as 
P 2 , where the right order depends on the spectral 
characteristics of the data (see [|2(|, where it has 
been shown that FIR filters a few hundred taps are 
sufficient for correctly whitening real interferom¬ 
eter data). Once the filters are estimated, they can 
be applied with a cost 0(NP). One possible ad¬ 
vantage of these time-domain methods, which are 
generally slower than the FFT (but can be imple¬ 
mented very efficiently on DSP systems) is that 
they can follow a slow noise non-stationarity: this 
issue however requires more study, because the 
very definition of the “burst” statistic needs a re¬ 
vision, when the noise is not stationary. 

4. Drop the first and last M points in the data vector y, 
in order to reduce the boundary effects, and partition 
it in segments yn of length A . The partitioning will 
require to overlap the yn segments: in any case one will 
have to take into account the correlation of the resulting 
statistics. 

5. Use the A — 2M data in y to estimate the correla¬ 
tion matrix R V over lags of at most Aii: this operation 
can be done using DFT methods with a cost at most 
0(N x I ii A), or smaller if we can exploit the fact that 
only (RA, is needed. 

6 . Decompose of the matrix (RA,, using either the 
Karhunen-Loeve or the Fourier transform. 


threshold on the value of the eigenvalues 0/ 0 This 
subset consists of Nkl < A elements, to be used 
in the actual evaluation of the statistic. 

(b) With the Fourier transform method (provided it 
is a reasonable approximation) we need instead 
only the spectrum of the time series y, at a re¬ 
duced frequency resolution df^ = f s /N\\- we call 
this spectrum S v [k], k £ [0, N\\ — 1]: it can also be 
computed by averaging the bins of the (S„) _1 , if 
the noise is stationary across instances of the vec¬ 
tor x. 

7. For each segment y|| assemble the statistic for the log 
likelihood, following a recipe dependent on the chosen 
method: 


(a) using the DKLT one computes 


Nkl i r 

L = E— vf k -y\\ 

£=i L 


(45) 


with a cost 0(NkiN\\ ) < 0(A?) in floating-point 
operations: this recipe gives the exact result. 

(b) Using the Fourier transform one instead first com¬ 
putes the Fourier coefficients y\\[k], a operation 
costing 0(N\\ x lnA||): then one combines the out¬ 
puts in the statistic 


L 


k 

A, 


7V||/2-t 

L 


k= 1 


1 

m 


yy Ml 2 ; 


(46) 


this expression is approximate, but its accuracy in¬ 
creases with A||, and can be legitimate in some 
cases. 


8 . Perform the statistical analysis of the results: we will 


discuss about this step in Section III 


A few comments are in order: 


• even though the procedure is designed to cope with long 
correlations in the data, we have implicitly assumed that 
no deterministic lines are present: in fact, the Wold De¬ 
composition theorem[|l7| sec. 7.6.2] states that a gen¬ 
eral random process can be decomposed in the sum 


(a) In the case of the DKLT, one needs the eigenvalues 
Gi< and eigenvectors \|/| of the matrix 

At | 

( R .v)|| = (44) 

*=t 

this decomposition has to be done at most once 
for each y vector, and possibly even more rarely, 
depending on the noise stationarity. Generally the 
cost of this decomposition is O(N0, unless one 
is able to exploit the Toeplitz structure of the ma¬ 
trix R v . One possibly selects then only a subset of 
the eigenvectors, chosen for instance by setting a 


x(t) =x r (t)+x p (t) (47) 

of a regular process]^] x r [t) and a predictable 
process |[5~i]| x p (t). The latter could correspond to a har¬ 
monic of the power line: it would contribute to the spec¬ 
trum a term 

o;,8(v-v p ) (48) 

where V p is the frequency of the line and o p its contri¬ 
bution to the RMS noise. In the sample spectrum S n [k] 
this feature would translate approximately into a term 

yh.k p = T h.k p , (49) 

JS 
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a trend in T which is the symptom of a infinite corre¬ 
lation length. Such spectral features cannot be properly 
handled by spectral factorization methods, because they 
are deterministic components and not stochastic. It is 
advisable to subtract altogether such predictable com¬ 
ponents, because this operation can certainly improve 
the signal to-noise ratio : several examples are avail¬ 
able in literature 129 3^] . Notice in passim that violin 


mode lines excited by thermal noise belong to the class 
of regular processes, although they can also be modeled 
and partially subtracted H]. 


• If the noise is not stationary, we can still perform an 
adaptive whitening, and refresh the estimate of the DKL 
basis {t|/ } each time it is found necessary. However as 
we have anticipated, further study is necessary; for in¬ 
stance, the very definition of R v as an average becomes 
uncertain when one cannot trade ensemble averaging 
for time averaging. In case of a slow non-stationarity 
one may also think of using an adaptive estimator of 
the 8 filter in Eq. (^3|); this is a good topic for future 
research work. 


and observing that 


R7 


where 


(Rv) || B 

»' (Ry)x 


A D\ 
D' C j 


(51) 


A = ((Ry)[[) 1 + 



/ \ -I , / \ 

-f 

+ 

(( R y)||) BCB f ((R v ) || J 





D = 

f (Ry) ||) BC 



(52a) 

(52b) 


and the explicit form of C is not relevant for us. Integrating 
over y one obtains a factor 


• Whether it is necessary to adopt the exact statistic in 
Eq. (|45|), or the approximate one in Eq. (p7i|), can only be 
judged comparing the receiver performance in a definite 
situation. 

We turn out to the analysis of the distribution of our statistic 
for burst detection. 


Jdy ± e~^ y±C y-L+yj-CB^R,.),) Vn 

^2 [(( R v)||) ‘ BCB ' (( R ^)||)”‘] 


(53) 


I. Statistical analysis 

Given the statistic L we need the distributions do (L) and 
d\ (L|s) under the respective hypotheses Hq (no signal) and H\ 
(a signal s of unspecified form), in order to set up a detection 
strategy, for instance based on the Neyman-Pearson criterion 

& 

The distribution do ( L ) is immediately recognized to be a 
x 2 (W|l); we find illustrative to prove this directly, from the 
definition 

d 0 (L) °c J e-2 nR "‘ n (50) 

x 8(^L-(R,7 1 n) |l ((R,7 1 )||) 1 (R,7‘ n ),|)^ n 
yVM R )0 y 8 (r —y|| ((Ry)||) y\\^dy\\dy ± 

I 


comparing with Eqs. ( ^ll|52a| ), the term in square parentheses 
is recognized to be (Rj7 1 ),, — f (R y ),, J , hence 


d 0 (L) oc J e ~b ll(( Ry )||)" ly ll 

xS^L-yn (( R v )||) y||^dy|| 

= L _ e ~ L / 2 (54) 

2 w l|/ 2 r(A|[/ 2 ) 

which is as expected the distribution of a % 2 (/V,) [|l6j. Sec. 
4-3]: we underline the role of the cross correlation effects, in 
order to derive the correct result. 


When on the contrary a signal is present, the distribution can be written as 


d\ (L\s) = H. |e“i nR '' l n 8^-(R ) 7 1 -(n + s)) |r ((R,7 1 ) [| ) _1 -(R,7 1 -(n + s)) |[ ) 


dn 


= H. / e 


J 


,-i( y - R « 1 ' s )|f(( R y)n) '{y- R ^ 1 - s )|| 


SU-yir ((Ry) 


yy ^yii 


(55) 
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which immediately results in the non-central y 2 (N ): in terms of the signal-to-noise ratio (SNR) one has 

where of-] is a hyper-geometric function |[32[ sec. 9.14].[ 


( 56 ) 


The SNR explicit form is 


R.T'-s 


SNR = 


R v)| 


R^-s 




consistently with the general definition ||21j, chap. 6]: 

\EiL\mj-EiL\Hol\ 


SNR: 


(57) 


(58) 


E {L-E[L\Ho]) 2 \H 0 


where E [L\H] is the expectation value of the statistic L under 
the hypothesis H, and H \, Ho correspond respectively to the 
hypotheses of presence of absence of a signal. Please note that 
the SNR defined in Eq. < [ss| ) has nothing to do with the intrin¬ 
sic signal-to-noise ratio which would result from a matched 

filter procedure, SNR intrinsic yj J \s(f)\ 2 /S n (/) df; in par¬ 

ticular, our SNR is quadratic in the signal amplitude. 

Given the distribution do (L) the false alarm probability can 
be readily computed as 


Qf{Eo)= [ do(L) 

JLn 


dL = 


■($.¥) 


f (t l ) 

while the detection probability 

Qd (T 0 1SNR) = f°°di (L|SNR) 

JL n 


dL 


(59) 


(60) 


cannot be written in closed form. For large SNR it is possi¬ 
ble to approximate d\ (L|SNR) with a Gaussian distribution 
having the same first and second order momenta 


r/i(L|SNR) ~ 


L-tSNR^/aV^+Vn^ 
4(W||+2SNR,/2«n 


yjMN\\ +2SNR V /Svf) 


(61) 


obtaining 


~ 2 


1 + erf 


JVn + SNRySVf—L 0 


2w7V||-f2SNR v /2A^ 


(62) 


The expressions for Qf and Qd are the building blocks for 
setting up the detection strategy: they are appropriate if we 
are able to compute the exact statistic L. How should however 
do and d\ be modified, if we have instead chosen to compute 
an approximate statistic? 


1. Approximate statistic distribution 


Recall that we have discussed in Section IIF also the pos¬ 
sibility of defining the statistic L, when computed using the 
DKL expansion, using just a subset Nkl < N« of the basis 
vectors: assuming to have ordered the \| t k vectors and taking 
only the first Nkl , we define 


Nkl i 

L = 21nA(n)=£-( ¥ i .(R,7 1 -n) ) 2 , 
k=\ ° k 


(63) 


where we dropped the suffix || from the basis vectors ¥ / ; re¬ 
call that the basis is independent on the specific N segment 
in the data train. We know that in absence of signals the 
expansion coefficients = ■^='1^ • (R -1 ' n )|| are by con¬ 
struction uncorrelated, with zero mean and unit variance (see 
Eq. (27a)); they are also Gaussian variables, because they are 
linear combinations of Gaussian variables. Hence L is dis¬ 
tributed as a y 2 (Nkl), and analogously the formula in Eq. i 
fort/i (LjSNR) holds, where Mi —> Nkl and 


i Nkl i r 

SNR^-==£- \|/ • (R„ ■ s) n 

Q2Nkl 1 Gk L 11 


(64) 


again, these results are exact, despite the fact that we are using 
only a subset of the DKL vectors. This might be useful in 
order to implement y 2 tests for non-Gaussianity, similarly to 
what has been done in the analysis of Caltech 40m data while 
searching for coalescing binaries signals |[g||. 

The other possible approximation is the use of the Fourier 
transform instead of the DKL transform, as discussed in Sec¬ 


tion [IF2. In the limit in which this approximation is legiti¬ 
mate, namely, when N\\ is large and the \| t k vectors converge 
to the vectors w / ' of the Fourier basis, the coefficients of the 
Fourier expansion behave from the statistical point of view as 
those of the DKL expansion, and the same results apply. 


III. MULTIPLE DETECTOR CASE 
A. The signal at each detector 

The mathematical tools needed to compactly describe the 
response of interferometric detectors to a coherent GW signal 
have been laid out in several papers [|d[ |35|, [?6| |, and have 
been recently reviewed and applied to the problem of network 
detection of coalescing binary signals |p7||. We collect the def¬ 
initions and formulas useful to us in Section B 1 and we refer 
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particularly to [|37|], whose nomenclature we follow closely, 
for a complete treatment. 

We describe the incoming gravitational wave by means of 
a wave frame, having the Z axis aligned with the direction of 
propagation; the signal is parameterized by two polarizations 
/i+ )X , whose definit ion d epends on the orientation of the X, Y 
axes; as in Section IID, we will regard /? + x as independent 
nuisance parameters. 

Another important frame is the network frame, defined as 
centered on the Earth and having the Z axis oriented along 
the North pole and the X axis along the Greenwich meridian: 
rotations between the network frame and the wave frame are 
accomplished by the Euler angles (]),0,\|/; we can set t|/ = 0 
from now on, because it specifies a rotation of the wave frame 
around the Z axis, which is inessential. In turn we can define 


reference frames centered on the detectors (see Section B 1), 


and call ocl.Pl.Yl the Euler angles needed to rotate between 
the network frame a nd th e detector frame relative to the L-th 
detector: see Section B2 for the values of these angles for the 
interferometers under construction. 

We write therefore the signal at the L-th detector as 


s L (t) = h+ (t - X L ) E l + + h x (t- T|_) F* 


(65) 


where Tl is the delay of the signal with respect to what would 
be received by a detector at the center of the Earth: it de¬ 
pends on the direction of the s our ce. The two antenna patterns 
/' L , Ff are given in Section B_l as functions of the (]),0 and 
0 tL.pL.YL angles; we concentrate on burst signals and omit the 
dependence on time of the source location. In other words, the 
angles (]),0 are a function of time and of declination and right 
ascension, while the network frame rotates with the Earth. 

Please note that <f>, 0 should not be confused with the eleva¬ 
tion and azimuth angles 0 S ,(]) S which locate the source in polar 
coordinates with respect to the network frame: the two sets of 
angles are related by 


<bs= T + 0 , 0 , = 7 t — 0 ; 


( 66 ) 


be definitely true if the only noise sources were the fundamen¬ 
tal ones, namely those dictating the baseline sensitivity (with 
the possible exception of the seismic noise for detectors at the 
same site); technical noises, like power line interferences j38||, 
or external noises, like correlated magnetic field fluctuations 
||3% may spoil this assumption. 

Given this simplification, the likelihood ratio is just the 
product of the ratios for the M individual detectors 

M 

A(*| h) = II Al ( x lI s ) (67) 

L—1 

where we borrow from ||t]| a bold-italic notation for the di¬ 
rect sum x = xi © X 2 © ■ ■ ■ © xm of the data vector from the 
individual detectors. A is conditioned by the presence of the 
signal h, described in each detector by Eq. (^5j) in terms of the 
same two polarizations (in the wave frame) h +x : we have 

A|_(X|_|s) = e _ ^( SL ^'( R LL 1 * * * ) , 'j( S L)j + ( S L)^ R LL 1 ' ,£ L )i+<iL (68) 

where we have exploited the time invariance of the correla¬ 
tion matrices, and introduced a shift d\_ in the index of the 
data xl, changing the reference time at detector L in order to 
compensate for the delay T|_ (0, <])); hence the burst signal ap¬ 
pears at the same time in the individual data vectors |^3||. The 
matrix Rll represents the noise autocorrelation for detector L 
and the double index is irrelevant for the time being, but will 
be useful when dealing with cross-detector correlated noise. 
We can express S|_ in terms of the two polarizations, treated as 
independent vector variables: it is convenient to write 

s L [i] = h' [/] ■ F L = (h+ [t\ , h x [/]) • ^ ^jx V, (69) 

hence h should be regarded as a two column matrix, and Sl is 
a vector resulting by contracting one of the indices with those 
in the vector F|_. 

The likelihood results to be 


in the following we will always use, unless explicity stated, 
the Euler angles (]),0. 


A(jc| h) = e “3 h, ’(TLFL®K L L® F L)' h+h, ’EL F L®yL] 


(70) 


B. Network likelihood with uncorrelated noise 

We consider first a simpler case, assuming that the Gaussian 
noise of the individual detectors is uncorrelated. This would 


where we have introduced the 8-filtered data 
YL [*] = ( RQ!-x L ). +dL 
including the di time shift. 


(71) 


I 

The Gaussian integration over the h+,h x vectors can be performed, and we obtain the network log-likelihood 


21nA(x|0,<|>) = 

£ F i_G>yL 

t 

1 __i 

LL 

7=i 

LL 

l_ 

-1 

£ F i_G>yL 


L 

II 

L 


L 


(72) 


as expected, the fact that the signal is coherent across detectors makes it impossible to factor out the integrated likelihood 

in a product of terms. The expression obtained is similar to the one proposed by Anderson et a/.|[i~3[ Eq. (5.29)] apart 
the fact that, as in the case of a single detector, they have chosen the signal prior flat in the metric induced by the matrix 
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El F K < 8 > (Rll ) M < 8 > F' ll 


The formal notation adopted in Eq. ( [72| ) deserves some clar¬ 
ifications: 



_|_f X ) 

contains El^l Tl, a vector combining data from all 
the detectors, weighted with the inverse of the correla¬ 
tion matrix pertinent to each of them, and summed with 
weights depending on the sky direction considered. 


is the correlation matrix of the (vector) signal y. We note that 
the two time series y + and y x are jointly stationary , that is also 
their cross correlation depends just on the relative lag||55|]. 

We can introduce two DKL bases, l|f+ x , k € [l, ivjfj, for 
the two time series y +iX : in terms of these two bases 


® = I 

k,i 


o+v+ ® [v+]' o+ x y ® |y x ]' 
Vx <8 [V+]' °ki X v x 8 [Vx ]' 


(77) 


© = El Fl 8 (Rll)|| 8 Fj_ is a 2 x 7V| xiV|| x 2 matrix, where 
as in the case of the single detector Nu is the dimension 
of the 'Em subspace we are testing for the presence of a 
burst. It is built summing the matrices relative to each 
detector, then it must be inverted and contracted with 
the 2 x 7V|| matrices obtained restricting El Fl ®yL to 
'Em in order to construct the scalar statistic. 

The matrix y can be easily computed, while for ©, as in 
Eq. ( [21~[ ). the following identity holds, in absence of signal: 

£[y®/] = £f k <S>£ [y k® y l] ®Fl 

K.L 

= 52 Fk®R(kl)?® (73) 

K.L 

if the detector noises are statistically independent, it reduces 
to 


E[ y8 /] =X)F l 0(R[ l 1 ) || ®F ? l = 0 (74) 

where we have used R ( L L) y = R LL ' for the correlation matrix 
of the yL time series. 

Now we would like to factor the correlation matrices 
(Rll 1 ) || relative to each detector: however, each of them ad¬ 
mits a different KL expansion over the 'Em subspace 

[ r :l]||=E 0 aVl8 V( (75) 

k 


where the diagonal terms (0j and (a) xx are simple: 

A+ + (XX) o +(x) 

(°)« —OklOf , 


where ajj, C)j are the eigenvalues of the two DKL bases, while 
the off diagonal terms are 


+ x(x+) = y + 

°kl ~ L, K r \ 


V+(x) 


( R kk)|| ' V x (+) 


(78) 


or simply in terms of the estimated cross correlations 


„+x(x+) 

°kl 



•£[y+(x)8y x(+ )] V x(+) 


E 


c 


k r l 

+(x) c x(+) 


(79) 


where c^ x . are the coefficients of the DKL expansion in the 
two bases. Therefore the estimation of the matrix <5 is sim¬ 
ple: once the \|/ + ( X ) eigenvectors are defined, the eigenvalues 
give immediately the diagonal terms O' 1 !x x \ while the cross 
terms are most easily estimated from the data, performing the 
DKL decomposition and cross correlating the coefficients. 

The matrix © can be easily inverted, obtaining 


© 1 


E 


p w v+ 8 [v + \ 1 p ti x v+ 8 y x ] r 

p«Vx 8 [V + f P xx \|/ A x ®y x ] f 


(80) 


and the bases {W* *€[l...Af||]} are generally different 
for each detector| p4||. hence the sum of tensor products in 
Eq. (^3|) does not factor in a patent way into a product of 
terms: however, we are going to show that it can be factored 
introducing two DKL bases. 

1. Exact form for the network statistic 


where p = 0 1 is such that 

E E <fpf:=s^. (si) 

l=lq=+,x 

It is easy to show that the solution for p is (note that 0 ++( xx ) 
are diagonal matrices) 


Let us fully exploit our understanding in Eq. ( |73| ) that the 
matrix 


0=E 


L 


(F +) 2 F+F x 
F+F x (F l x ) 2 



(76) 


p xx = ( 0 xx - 0 x+ -( 0 ++ )“ 1 - 0 +x ) 1 (82a) 

p +x = — (o ++ ) _1 ■ 0 +x ■ p xx (82b) 

with similar expression exchanging x, +, in terms of products 
of Nn x Nu matrices. We can finally write the statistic L = 
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y • (©r 1 • y as a quadratic form over the coefficients of the 
two KL expansions: 


L= E ’ where 

k,l=l p,q 


C P = 




M 

■ (yp)|i = i F K 

K =1 


Vi 


(83a) 

(Rkk ' x k) I, ; (83b) 


just the Fourier transform of the y + x time series. In anal¬ 
ogy with the single detector case it is simple to prove that this 
statistic converges to the excess power statistic defined in | [i~3[ 
Eq. (5.29)] in the limit N\\ —> A, or equivalently when the cross 
correlations between the subspaces ‘Fn , ‘V can be neglected. 


3. Description of the algorithm 


where we should keep in mind that the F' functions depend 
the chosen direction in the sky, and with them both the coeffi¬ 
cients <3 P k 1 and the DKL basis vectors V|/*. 

As argued in Section IIF, the distribution(s) of the DKL 
eigenvalues may allow to approximate the sum neglecting 
terms which would result in noisy contributions: a in-depth 
discussion would require however to consider a realistic noise 
spectrum. 

We will c ome back to the algorithm defined by 
Eqs. (83a 83b) after having exposed the simplifications pos¬ 
sible in the case of large N \\. 


2. Simplified case: large 2Vj| 

As in Section m if Aii is large enough to justify the ap¬ 
proximation of the DKL with a Fourier transform, the matrix 
(R(kk)v) || = (Rkk) || can be written as follows: 


We find useful to briefly outline the detection algorithm in 
the case of uncorrelated noise across the detectors: we will see 
that the changes induced by the correlations can be accommo¬ 
dated easily. We can be sketchy because most steps are similar 


to those described in Section IIH 


Our M-detectors network produces M xN data, represented 
by the vectors xk; we are looking for burst of length A <C A. 
Then we should 


1 . filter the data of the individual detectors for the occur¬ 
rence of 8 events, obtaining new vectors yK; this oper¬ 
ation can be done with the Fourier transform and costs 
0(M x Ain A). 

2. For each direction in the sky, shift the yK vectors to 
compensate for the delays X/, and sum the M vectors 

_|_f X ) 

weighed with the F K polarizations, to obtain y +jX ; 
cost A s k y x 0(M x A) where we have introduced A s k y 
as the number of effectively independent directions. 


fs 


TV,, —1 


( R (KK)y)|| = XT E VK)yW W *®wf 


H 


(84) 


k=0 


where S( KK )y [&] is the one-sided noise spectrum of the (yK )m 
data; its frequency resolution is f s /A|, and it does not de pen d 
on the direction in the sky. Now the matrix © (see Eq. (73)) 
can be factored out and we obtain in analogy with the results 
of Anderson et a/. [[i~3|. Sec. V C] 


EF K ® (R K k)|| ® F r K = y E S v M ® wf (85) 


tVn—1 


k—0 


where we have introduced a network spectral density S y for 
the 8 filtered data y: each element of S y is a 2 x 2 matrix, 
depending on the Euler angles «j> 0 through the Fk terms 


S y [k]=E%K)yW Fk®F' k . (86) 

K 


We can therefore 

rewrite the statistic in Eq. (|72|) as 



7 , N\\~2 


L(x) = 

7T I [yMir PyMrMyW]|i 

iV ll k= 1 

(87a) 


M 


y[k] = 

E Fl -^ l W 

(87b) 


L=1 


in complete analogy with the single detector case, Eq. (p6|); in 
this case y [k] is a 2 elements vector, whose components are 


3. Estimate the \|/ + x DKL bases appropriate for expand¬ 
ing (y +iX )||. This step may not be needed for large A||; 

if required it costs O (n s \^ x N^j. 

4. If using the DKL bases, build 0 1 , that is compute 

the 2 x A|| x A|| x 2 matrix this step re¬ 

quires (9(A s ky x Nu) operations; otherwise, estimate 
the “network” spectral density S v [k] of the y data, 
with frequency resolution f s /A|, for instance averaging 
over the A/A|| possible L|| subspaces: a step costing 

xA||lnA||) +<9(MxA||) 

5. For each possible interval of length A|| in the data vec¬ 
tor, perform the decomposition over the bases t|/ + x 
or alternatively over the Fourier basis: the cost is 
O (A s ky x A x A||) in the first case, O (A s k y x A x InA||) 
in the second case. 


6 . Eval uate the statistic L either using Eq. (83a) or using 
Eq. (87a), depending on the path followed. 


In the large A|| approximation several further optimizations 
are possible: for example the Fourier transform of data yL 
can be done once, and the shifts needed to evaluate the sum 
Ll Fl 0 y l for different sky directions become phase factors 
to be attached to the Fl tensors. The big unknown in this esti¬ 
mation is the number A s ky of independent sky direction which 
should be probed. Given the “spin-2” dependence of the an¬ 
tenna patterns on the sky location, one expects a relatively 
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slow variation, hence it should be possible to sample the solid 
angle on a reduced number of cf>, 0 angles, and possibly exploit 
hierarchical methods to further reduce N s ky- this may be the 
subject of future work. 


have defined in the wave frame. We borrow from|j2f| the no¬ 
tation (with a different normalization) 

<■ a,b) R = a(R)~ 1 b= £ a K [k] (R- 1 ) KLk , b L [l\; (90) 

KLkl 


C. The case of correlated noise among the detectors 


R is a MN x MN matrix which we regard as a tensor 


The problem of correctly writing down the likelihood in the 
case of correlated noise among M detectors has already been 
studied in depth by Finn [|2(]], who has also proposed to ap¬ 
ply a transformation to the M data channels, which would de¬ 
correlate the noise. We face here a technical difficulty, such a 
transformation would also “rotate” the signal and render awk¬ 
ward the bookkeeping in our derivation: we want here to ex¬ 
plore a different approach, motived by the hope that the cross¬ 
correlation terms will result to be significantly smaller than 
the diagonal terms, so that a perturbation expansion is possi¬ 
ble. 

If there is correlated noise, the likelihood for observing x in 
presence of a signal h can be written in full generality as [ | 20 | , 
Sec. Ill B] 


A(jc| h) = exp 



(s,s) R + {s 1 x) R 


( 88 ) 


where we have already defined the symbol x for the M x N 
matrix representing the M time series, each of length N, pro¬ 
duced by the detectors, while 


■s = Si©s 2 ©---©s M (89) 


is the signal at each detector, dependent on h+,h x that we 


R = E [n © n] , 


(91) 


that is 


(-R)klh = (Rkl)*, =E[n K [k + dK]n L [l+d L }] . (92) 

Each NxN matrix Rkl is a Toeplitz matrix depending only 
on k — l + (o?k ^ d\_) and not necessarily symmetric, unless 
K = L; only the symmetry (/?) KUH = (/?) LK;jt holds, which 
ensures that (a, b) = (b. a). Notice also that, similarly to the 
previous sections, we have shifted the labeling of data on each 
detector so that the burst is simultaneous in the time series: we 
must be careful, because the time shifts do not cancel out in 
the cross terms of the correlation matrix. 

The inverse matrix ^ 'is defined, as in ||2C|], such that 

M N 

SuSy =(*-'*) = £ (93) 

K=U=1 

and the network likelihood can be written explicitly as 

A (x| h) = e _ 2' s KW(^ 1 ) K LH iL W+ s KH(« _1 )KLH- ,:L [ ,+rfL l; (94) 

summation is implied over the indices k. I labeling the sam¬ 
ples, and the indices K, L labeling the detectors. 


J 


We have already written in Eq. (69) the explicit form of ,V|_ [/], and we can proceed as before to integrate over the nuisance 
parameters h +tX : we obtain formally a similar expression 


21nA(jc| 0,<» 


t 


^FiOy, 

I 


KL 



•yi 


(95) 


where again the suffix || means restriction of time indices to the Vii subspace; © is a 2 x Nn x Nn x 2 matrix, constructed 
contracting the detector indices in R with the corresponding indices in the 2 x M matrix F, while y || is a 2 x N matrix obtained 
contracting the detector index of F with the corresponding index i n the matrix y. _ 


Notice that yi combines data from the different interferom¬ 
eters: one has by definition 


yi [*']=£(* ij x AJ + d j]> (96) 

J i 


a sort of 8 function filtering for multiple interferometers. 

Given the formal exact solution in Eq. (^5|), we proceed 
with the assumption that the cross correlations among detec¬ 
tors are much smaller than the internal correlations: more pre¬ 


cisely, that 


IM 


«i 


(97) 


where ||A|| = max|| x | 1=1 ||A ■ x|| is the matrix norm induced 
by the standard vector norm | |x| | = ©x ■ x. We split R into a 
block diagonal D, and a off-diagonal O: 


R = D + O 

( D )kL kl = § KL (RKK)„ 


(98a) 

(98b) 
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and expand R 1 in powers of O using the identity 


1. Simplified case: large M 


(D + 0)~ l =D l D 1 0(D + 0) 1 (99) 

which allows to approximate 

R 1 ~ D l D l O [D~ l D l O (D 1 -...)] (100) 

to any desired precision. This expression is useful because the 
inverse of the block diagonal matrix D is simple 


Mku^MRkk)*,, (101) 

in terms of inverse correlation matrices on each detector. 

We can use this expansion to write down an approximate 
likelihood: the 8 filtered data yi< are 


yK — R K K ' Xk V. Rkk • RkL • R| I X L 
L^K 


( 102 ) 


where we understand the shift of the data x, to simplify the 
notation, and we keep only the first order in Rkl- This ex¬ 
pression can be evaluated in the Fourier space 


«['] -jjl. 

^ k= 1 


2 N—2 p—i2nkl/N 


5kk [k] 




L/K 


Sulky 


I (103) 


and can be easily extended to higher orders, leading to 


N-2 


y=^ J Le- i 2 m/N mr 1 -x{k] 

iV A=1 


(104) 


where x is the MxN data “vector” of the network, and S [k] is 
the MxM matrix with elements 


Skl [k] = — wf • Rkl • w*. 
Js 


(105) 


To prevent misunderstandings we underline that this proce¬ 
dure is not a whitening, and it does not correspond to defin¬ 
ing uncorrelated data channels: it is instead the multi-detector 
analogous of filtering for the occurrence of 8-function events. 
Having obtained in this way an approximation to some de- 

y+ 


sired order of the y = 


yx 


= It Fj (g> yj data matrix (of size 


2 x N), we are actually able to proceed with the same meth¬ 
ods exploited in the uncorrelated noise case. In order to test 
for the occurrence of a burst in the subspace V\\, we need to 
restrict the y data to the burst su bspace 'Fit and define there the 
DKL bases \(f +x , as in HI B 1 or Fourier bases if N is large 
enough. 

We are then able to estimate the matrix 0 = E [y (g) y] and 
compute the elements of the G matrix (see Eq. (|T7|)) exactly as 
in that case: the diagonal elements a++( xx ) just as diagonal 

_|_f x ) 

matrices built from the eigenvalues a k , and the off diago¬ 
nal matrices 0 +( x ) using Eq. (f79[): all the remaining derivation 
goes unchanged. 


As in Section IIIB 2, a considerable simplification is possi¬ 
ble if the DKL ¥+(x) bases of the 7'jj space converge to the 
Fourier bases: in this case we have, by the very definition 
0 = E [y (g) y], and in analogy with Eq. (]85|) 


L F ,®(*' 1 ) IJ ®F t J = 
u 

Sy [k] = 


[k]®wf 

z k 

f S++[k] 5 +x [k]\ 
\s x+ [k} S X x [k] ) 


(106a) 

(106b) 


where in turn S pq [k] are the 4 possible cross-spectra, at fre¬ 
quency resolution f s /Nn, defined from the data y+,y x - Hence 
the log-likelihood has exactly the same expression as in the 
uncorrelated noise case 


L ( x ) = I lyWf ■[Sy[*r'-|y[*]]|| (107a) 

iV || k= 1 
M 

y[k] = £f l> \ [k] (i07b) 

L—1 


with the difference that the y k have been obtained combining 
data from the different detectors, in a manner depen dent on the 
direction in the sky; approximately as in Eq. (103), or exactly 
as in Eq. (104). 

The algorithm described in Section III B 3 goes almost un¬ 
changed: the only real change is in the computation of the 
data yK- In particular, we have to rearrange steps (1) and (2) 
because now the 8-filtering must be preceded by the time shift 
of the data xl: the two operations no longer commute. All the 
other steps remain unaltered. 


D. Distribution of the network statistic 


Having given in Section [II] a demonstration that the statis¬ 
tic L is a X 2 distributed variable, we can generalize those re¬ 
sults: we know that the network statistic 


L = y|,..@- 1 .y|| (108) 

is written in terms of the inverse correlation matrix of the vari¬ 
ables y|| themselves, viz. 

® = E [yjiOyn] (109) 


where the average is taken in absence of signal. Hence again 
L is a % 2 variable, as can be directly checked considering its 
first two moments 


E [L\H 0 \ 
E[L 2 \H 0 \ 


E 

2N\ 





(110a) 


2 tr(©0^ 1 0©~ 1 ) + tr(©0 


2A|| (2A|| +2) 


(110b) 
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and generalizing with a bit of combinatorics to 


and, noticing that (R ll ’) [ci, a] = rms (yi_) 


E[L m \H 0 ] = l\[2N ]] +2(k-l)] (Hi) 

tr=l 

which is the moment expansion of a y 2 (2N ) variable. In the 
same way, in presence of a signal the distribution d 1 ( L ) will 
be given by Eq. (^6|) with the substitution Nn 2Nu and the 
SNR defined as 

SNR= !Bh£H. (112) 

V 2 < 2 A, i) 

the extra factor of \/~2 in the denominator results from the def¬ 
inition of the signal in terms of two polarizations. 


E. Example: network sensitivity to 8 events 

“netw ork spectral density” introduced in 
106b) is a 2 x 2 matrix of spectra and (com- 


The 
Eqs. ([ 

plex) cross-spectra, which depends on the sky direction: it is 
therefore interesting to derive some scalar quantity which can 
be plotted in a spherical projection and give a visual idea of 
the sensitivity of the network. 

As a simple example, motivated by the short duration of the 
impulsive features in some of the model waveforms | [i~4| . Fig. 
2 (Model A)], we may consider the response of the network 
to a burst of duration unburst = 1 / f s , where f s is the sampling 
rate in the detectors, and having amplitudes A + , A x in the two 
polarizations. We assume that the noise is uncorrelated across 
the detectors, and that the data have been shifted to capture 
the event in every data streams at the same time index a\ we 
have 


x L [k] = (A + F+ +A X F £) L e ~j 2nka / N 
Js 

and the 8-filtered signal is (see Eq. (51)) 


(113) 


y L [/] = (A + ^ 


+A X E X 


2 e 


Sll [k] 


(114) 


projecting on the ‘lA subspace means setting l = a, hence 


y L [a] = (A+F 2 


+a x e l x ; 


where we have defined 


1 W/2-1 

rms(y L ) = - V 

t U fsN tx 


rms(y L ) 


1 


1 


If 

f} Jf, 


s LL [k] 

f Nyquist df 


SLL(f)’ 


(115) 


(116) 


this is the same quantity resulting from the analysis in [|40||, 
where burst signals with uniform spectrum in the detection 
band had been considered. Next, we have 


: £ rms(y L ) (A+F+ +A X F L ' 


1 L 

F x 


(117) 


@ = £rms(y L ) 


W) 2 

F + F x 

r L r L 


F+F x 

m 2 


(US) 


the log-likelihood statistic L = y|| ■ 0 1 • y || can now be easily 
evaluated: we average over A + ,A X keeping their geometric 

mean A = \1 A 2 + A 2 fixed, and evaluate the resulting SNR 


as in Eq. ( |112| ) for networks of interferometric detectors built 
out of different partitions of the instruments currently under 
commissioning. Because of the f s 2 factor in rms (y l), the 
scale is set by the effective amplitude A dt = A /“ 1 . 

We report in Fig. ([!]) two polar plots of the SNR, obtained 
setting Adt = 10~ 23 s; with 1 = <9 (lms) this would corre¬ 
spond to a strain A = O (10~ 2 ®), possible for a core collapse 
event at a distance of 10 kpc [|9j, [i~4| |. We have considered ei¬ 
ther the network of three LIGOs interferometers, or a network 
including also GEO600, TAMA and Virgo: the details on the 
detectors are reported in Sect ion |B 2| , where the nominal noise 
spectrum is modeled in Eq. ( B16 ) and in Tab. |jl[ while loca¬ 
tions and orientations are reported in Tab. [I] 

The plots have been obtained using a Mathematical note¬ 
book which is available from the author upon request. 

The global interferometric network appears significantly 
more sensitive, and much of the effect is due to the con¬ 
tribution of Virgo: however the result should be considered 
merely illustrative, because the chosen shape of the burst (a 
8 -function) corresponds to a flat spectrum in the frequency 
domain; this choice favors the Virgo detector substantially, as 
already noticed in [ 401, because of the wide bandwidth of the 
model sensitivity of Virgo. 


IV. CONCLUSIONS AND OUTLOOK 

In this paper we have defined a statistic for the detection 
of burst signals, which is well suited to be applied to data af¬ 
fected by colored noise, thus properly generalizing the excess 
power statistic 0 0- 01 to the case in which the spectral 
noise density varies significantly over the frequency band of 
interest, and the signal prior is assumed to be flat in IK' V . It 
is optimal in the Bayes sense, under the two hypotheses that 
the signal is distributed uniformly in amplitude, and is con¬ 
taminated by additive Gaussian noise. The extension to the 
network case was straightforward and it was also possible to 
take into account in a natural way the possible presence of 
Gaussian noise correlations among the detectors, either per- 
turbatively or exactly. 

The lack of assumptions on the GW signal distribution is 
both an advantage and a disadvantage. We believe that the 
proposed statistic is the correct one for a detection strategy 
free of a priori assumptions, apart the duration of the burst; 
yet we are aware that it does not lend itself to easily include as¬ 
sumptions on the amplitude distribution of the signal, as it was 
possible in [|l3[], thus making difficult to set Bayesian thresh¬ 
olds. 

Our generalization is not much more expensive, from the 
computational point of view, than the statistic discussed by 
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Figure 1: Polar plot of the SNR for bursts having Adt = 10~ 23 .s, as a function of the direction of the source. The figure at left refers to the 
network of LIGO detectors, while the figure at right includes also GEO600, TAMA and Virgo. The axes of the network frame are shown: we 
recall that Z points toward the geographical north, and X crosses the Greenwich meridian. 


Anderson et al. 0 0; in its simplest implementation it 
amounts to perform a matched filtering for 8 -functions fol¬ 
lowed by the calculation of a “energy” over the time window 
to be tested for the occurrence of a burst. The whole analysis 
chain can be implemented combining a few standard time- 
series analysis tools and therefore implemented with efficient 
algorithms. The limited computational cost of the single de¬ 
tector search method suggests that it may be applied to the full 
data set, before any triggering is performed: in this sense it is 
a method proposed for the on-line search. 

An evaluation of the actual detection performance, when 
considering theoretical waveforms and simulated noise, re¬ 
mains to be done; this will be the subject of future work, much 
along the lines of [ 010 - 

The cost of network detection was not fully estimated: it 
depends on the number of directions in the sky effectively in¬ 
dependent, which was not studied in this paper. 

Besides the detection of gravitational events, we propose 
this statistic as a tool for excess noise characterization: real 
interferometric detectors are definitely affected by non Gaus¬ 
sian noise [|33|], and as long as the excess noise is dominated 
by burst signals, like Poisson distributed creep events, we can 
think of using this statistic as a tool for detecting and char¬ 
acterizing them. Once a event results in a instance of the L 
statistic above the threshold, and therefore a burst of excess 
noise or a gravitational wave is detected, our algorithm pro¬ 
vides a way to encode this information in a manner optimal 
with respect to the distribution of the noise [[l 8 ]] . In fact, the 
discrete Karhunen-Loeve transform that we have chosen as a 
tool to compute the “energy” over the burst time window is 
equivalent to a principal component analysis: the coefficients 
of the DKLT are ordered by the amount of RMS noise con¬ 
tributed by the corresponding basis vectors. A candidate event 


can therefore be encoded selecting just the largest DKLT co¬ 
efficients, while retaining most of its relevant “energy”, that 
is, the energy that is distributed in less noisy components. 

Moreover, we recall that in absence of signals the DKLT 
coefficients are statistically uncorrelated: one may instead an¬ 
ticipate peculiar, spurious correlations when a signal of any 
nature is present. These correlations will emerge as regular¬ 
ities if the events occur repeatedly in time, and it should be 
possible to catalog them in an automatic way, for instance by 
using clustering methods in the vector space of the DKLT co¬ 
efficients. 
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Appendix A: DFT CONVENTIONS 

We list here our conventions for the discrete time, discrete 
frequency representation of stochastic processes. Let f s be the 
sampling frequency, and N the length of the data sample for 
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the discrete time process x [l }; then the Fourier transform pair 

x «-> x is 


x[k] 

s I 

A i = o 

(Ala) 

x[l] 

^ . 

** 

£ 

<N 

iUi 

II 

(Alb) 


1. Geometry 

We adopt the following reference frames [|V7||: 

network frame centered on Earth and chosen with the Z axis 
aligned along the geographical north, the X axis cross¬ 
ing the Greenwich meridian; 


the one-sided sample spectrum S x [k] is defined by 




\x» Ml 2 


detector frames centered on the beam splitter of each detec¬ 
tor, the Z axis pointing toward the local zenith and the 
(A2) X axis bisecting the detector arms; 


where x, J( <-> (fit * x) is the Fourier transform of a suitably 
windowed realization of x. For large N one has that the cor¬ 
relation function R x (jj^j = (R.r) [a,b\ and the sample spec¬ 
trum are approximately Fourier pairs 

Rx W ^ | l' k [k] e i2nk ^ N - (A3) 

W k=0 1 

or equivalently 

N-l , 

Rv ^fsYj M W A- ® W f > (A4) 

k =0 Z 


wave frame having the Z axis aligned along the direction of 
propagation of the wave, and X axis lying in the (X, Y) 
plane of the network frame. 

Rotations of coordinates from one frame to the other are ex¬ 
pressed in terms of Euler angles, that is 

Xwave — O ((]), 0, \J/) • Xnetwork (Blci) 

Xdetector|_ — ^ Pl_5 Yl_) * ^network > (Bib) 

if 0j,(|). s are elevation and azimuth of the source in the network 
frame, the relation with the Euler angles (|), 0 is 


where wa are the Fourier orthonormal basis vectors 


w k = 


1 

Vn - 


l,co\co 2 V..co 


(N-l)k 


with © = e l2n ! N . For a zero-mean process one has 

(RT 1 ) [a,b] ~ —V -?—e i2n{ - a - b W N 

f s NtxS x [k\ 


(A5) 


(A6) 


and the useful relations 

(Rx-y)[/] * (A7a) 

yV k=0 Z 

(R-‘-y)[/] ~ ± N f-J^y[k]e i2 « kl ' N (A7b) 

where in the second one also y is a a zero mean process. These 
are approximate relations because of the finite N. 


Appendix B: NETWORK MODEL 

In this appendix we summarize the mathematics needed to 
describe a network of detectors, the essential geometric char¬ 
acteristic of the interferometers under construction, and their 
anticipated model spectral densities. 


4» = <K 


-; 0 = k- 0 *. 
2 


(B2) 


The t)/ angle is zero, according to the definition of the wave 
frame. One can introduce the wave tensor 

w(f) = ^ [(h+(t) + ih x (t))e R + (h+(t) -ih x (t))e L ] (B3) 

where the helicity states e^.e/, can be written as 

£l,r = ^ (ex ± /'ey) (g) (e x ±ie Y ) (B4) 


in terms of unit vectors ex, ey specifying the X,Y axes of the 
wave frame as. In the network frame they can also be writ¬ 
ten as Symmetric Trace Free tensors of second rank (STF-2) 



T±2n(W, 0) (fX2„) network (B5) 

where T mn , with m.n = 0,±1,±2 are Gel’fand’s functions of 
rank 2 and depend on the Euler angles (]), 0,\|/ (= 0) needed to 
rotate the network frame to the wave frame. 
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For completeness, the explicit form of the T l mn functions for arbitrary rank is[341 as follows: 
rjp,(4>,e,v) = e-w +m ^p l mn ( cos 0 ), 

{ 1 ;n—m 

Km( m) = 

where m,n G [—1, /]. 


^j y—m •n—m 

1 m—n , 

j{l — m)\(l + n)\ ( 1 — ju)^~" d l ~ n 

\i+n) m+r 

2 l (l-m)\ \] 

(/ + m)\ (1 — n)\ dn l ~ n 



(B 6 ) 

(B7) 


The detector response is encoded by the tensor dL 

d L = sin(2f2 L ) (n (L)1 <g>n (L)1 -n (L)2 (g>n (L)2 ) (B 8 ) 

in terms of the unit vectors aligned along the L-th interferom¬ 
eter arms, and the aperture angle 2Cli of the arms. The factor 
sin (2^) is 1 for all the detectors apart GEO600, where it is 
0.997; consequently we will understand the factor in the fol¬ 
lowing, to simplify the notation. 

The tensor d has a simple expression 


tf[d- (92m)detector] 



(5m2 &m— 2 ) 


(B9) 


in terms of STF-2 tensors in the detector frame. With two 
successive rotations we obtain the coefficients of its expansion 
in terms of STF-2 tensors in the wave frame 


tr[d- (72m) network] 


tf [d • (72m) wave ] 


'15 


r 2 :„(a L ,p L ,y L ) 


8ji - 

-r 2m (a L ,p L ,y L )] (B10a) 

H P™ (^’^O) tr[d- (72n)netw 0 rk] 


-Dm (<|),9,0,oc L , Pl,Yl) (Blob) 


where we have introduced in the last formula a short-hand 
notation. 

Finally the signal at the L-th detector, after proper h- 
reconstruction in order to deconvolve the interferometer re¬ 
sponse function, will be the scalar 


s i{t) = tr [w (f — x L (<f> 0)) - d L ] (Bll) 

where Tl is the delay at the L-th detector with respect to the 
network frame; it can be positive or negative depending on 
the direction of the source. In terms of the (complex) beam 
pattern functions Fjj’ = tr (e/ft • dL) for the two left and right 
wave polarizations 

Ft = Z>_ 2 (<|>,e,0, a L ,p L ,Y L ) (B12a) 

Ft = (Fty = 02(<|>,e, 0,a L ,pL,y L ) (B12b) 

one can alternatively write 

s L (t) = X[(h+{t-T L ) + ih x (t-x L ))Ft] (B13) 

= \ [(/*+ + ihx)D2 (•••) + (h+ ~ ih x )D-2 (■••)] 


or equivalently and more conveniently for our work 


s L (t)=h + F++h x Ft (B14) 

where, reintroducing the aperture angle 

Ft = sin (2f2 L ) 91 [D- 2 (<|>, 0,0, a L , Pl,Yl)] (B15a) 
Ft = sin(2£i L )3[D- 2 (<M,0,a L ,p L ,YL)] . (B15b) 

The given expression for the signal and for the antenna pat¬ 
terns, as stressed in[|37|], is convenient because it keeps in fac¬ 
tor form the rotations among the various frames. 


2. Interferometer network characteristics 


The geometrical characteristics of the detectors considered 
in this study are listed in Table [j] where we quote the lati¬ 
tude north of the equator, the longitude east of Greenwich, 
the azimuths of the X and Y arms of the detectors, measured 
counter-clockwise from the local east, and the a, P,y Euler 
angles needed to rotate coordinates in the network frame to 
coordinates in the detector frame: all the angles are mea¬ 
sured in radians. The orientation data are taken from |[TT1| 
and updated with informations from the web sites of the dif¬ 
ferent collaborations [^ 2 [; the naming of the axes has been 
changed in one case so that all the detectors have azimuth(X 
arm) < azimuth(Y arm). The resulting Euler angles should be 
taken with care because are computed in the approximation 
of spherical Earth and neglecting the elevation of the detector 
sites and the fact that the arms are cords and not tangents of 
the surface. For a more accurate model, please see [ 441. 

We show in Fig. (J2|) the locations of the detectors and the 
reference frames attached to them, from two viewpoints above 
Europe and the United States of America. 

The other important characteristic of the detector is their 
planned sensitivity. We have chosen to include only the base¬ 
line thermal and shot-noise sources, omitting resonances in 
the observation band: the model for the noise spectrum (fil¬ 
tered to deconvolve the response function to gravitational 
waves) is therefore 


S.(f) = 


/ 


' T ^shot 


1 + 


/knei 


(B16) 


where 5 penc i quantifies the thermal noise of the mirror pendular 
mode, above the pendulum resonance; S m i 1Tor quantifies the 
1 /f tail of the internal modes of the mirror, excited by thermal 
noise; S s h ot and fknee parameterize the optical read-out noise. 
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Detector 

Lat. 

Long. 

X azirn. 

Y azim. 

a, p,y Euler angles 

GEO600 

0.911935 

0.171217 

0.377166 

2.02353 

-1.20035, -0.658862, -1.74201 

LIGO Liv. 

0.533373 

-1.58424 

3.45575 

5.02655 

2.04204, -1.03742, 0.013439 

LIGO Han. 

0.810705 

-2.0841 

2.21308 

3.78387 

-2.99848, -0.760091, 0.513301 

TAMA 

0.622733 

2.43543 

3.14159 

4.71239 

2.35619, -0.948063, 2.27696 

Virgo 

0.761487 

0.18326 

1.24791 

2.81871 

-2.03331, -0.809309, -1.75406 


Table I: detector locations and orientations, and Euler angles (approximated) needed to express coordinates in the network frame in terms of 
coordinates in the detector frames. 



Figure 2: The locations of the detectors on Earth, labeled by their initials (H and L for LIGO Hanford and LIGO Livingston); the axes of the 
network frame, labeled X,Y,Z and of the various detector frames are also shown. To the left: view from above Europe. To the right: view from 
above the United States of America. 


In addition to these parameters, we call / se i S m the cutoff be¬ 
low which the seismic noise is supposed to dominate over the 
thermal noise. This simplified model does not include at least 
two important effects, the thermal violin mode resonances and 
the internal mirror resonance peaks, and should be considered 
merely illustrative. 

We report in Tabl e pT| the numerical values of these parame¬ 
ters, deduced from [|43|], and in Fig. (Q) the comparison of the 
different noise spectral densities. 

For the model at hand, the inverse correlation function 

R-\x) « 9t /” e i2 ^-l—df (B17) 

JO S n (/) 


is well approximated by an expression of the form 

Rf l 2 3 4 5 6 7 { T ) — A 0 e“ |T|/x °cos(27i:/oT + (|)o) 

+Aie“ |T|/Tl cos(27t/iT + (j)i) ; (B18) 

we list in Table 0 the values of the decay times To.i and of 
the “ringing” frequencies /o.i deduced from the values in Ta¬ 
ble |u| the small values of the Top simply reflect the absence 
of resonances in the model noise curves. 
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Figure 3: A comparison of the baseline spect ral se nsitivities, in units of Hz -1 / 2 , for the different interferometers currently under commission¬ 
ing, according to the simplified model in Eq. (B16). The two LIGO 4km interferometers “H4K” and “L4K”, located in Hanford and Livingston 
respectively, are supposed to have the same sensitivity. For clarity’s sake we do not display the so called “seismic wall” at / se j sm , which should 
be understood as dictated by Table [b]. 


Detector 

/seism 

•Spend 

•^mirror 

Sshot 

/knee 

GEO600 

50 

4.1e-36 

9e-43 

le-44 

577 

LIGO 2K 

40 

2.1e-35 

2.25e-43 

4.35e-46 

182 

LIGO 4K 

40 

5.6e-36 

3.9e-44 

l.le-46 

83 

TAMA 

50 

6.6e-31 

3.2e-40 

1.78e-42 

500 

Virgo 

4 

9e-37 

4.5e-43 

3.24e-46 

500 


Table II: Parameters cha racter izing the baseline noise of the detectors 
in the network; cfr. Eq. (B16). 


Detector 

TO 

/o 

Tl 

fl 

GEO600 

5.6e-3 

32 

2.7e-4 

44 

LIGO 2K 

2.4e-3 

70 

6.1e-4 

107 

LIGO 4K 

2.5e-3 

83 

l.le-3 

51 

TAMA 

1.4e-3 

141 

3.1e-4 

82 

Virgo 

6.0e-3 

27 

2.2e-4 

293 


Table III: The characteristic ringing frequencies (measured in Hz) 
and correlation times (in sec.), for each interferometer, deduced from 
the simplified noise models we have adopted. 
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